Crossing the finish line faster when paddling the Data Lake with Kayak
Abstract
Paddling in a data lake is strenuous for a data scientist. Because a data lake is a loosely structured collection of raw data with little or no meta-information available, the difficulties of extracting insights from it begin in the initial phases of data analysis. Indeed, data preparation, which involves many complex operations (such as source and feature selection, exploratory analysis, data profiling, and data curation), is a long and involved activity for navigating the lake before reaching precious insights at the finish line. In this context, we demonstrate Kayak, a framework that supports data preparation in a data lake with ad-hoc primitives and allows data scientists to cross the finish line sooner. Kayak takes into account how long the user is willing to wait for the primitives' results, and it uses incremental execution strategies to produce informative previews of these results. The framework relies on careful management of metadata and on features that limit human intervention, thus scaling smoothly as the data lake evolves.
Journal:
- PVLDB
Volume 10
Publication date: 2017